Example-Based Machine Translation for Low-Resource Language Using Chunk-String Templates

Authors

  • Khan Md. Anwarus Salam
  • Setsuo Yamada
  • Tetsuro Nishino
Abstract

Example-Based Machine Translation (EBMT) for low-resource languages such as Bengali suffers from low coverage because parallel corpora are scarce. In this paper, we propose an EBMT system for low-resource languages that uses chunk-string templates (CSTs) and translates unknown words. A CST consists of a chunk in the source language, a string in the target language, and word alignment information. CSTs are prepared automatically from an aligned parallel corpus and WordNet. To translate unknown words, we use the WordNet hypernym tree and an English-Bengali dictionary. If no translation candidate is found, the system transliterates the word. In human evaluation, the proposed EBMT improved wide coverage by 41 points and quality by 48.81 points.
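
The abstract does not give implementation details, so the following is only a minimal sketch of how a CST record and the unknown-word fallback could look. Everything here is an assumption made for illustration: the names `ChunkStringTemplate`, `translate_unknown`, and `transliterate`, the toy hypernym map standing in for WordNet, the romanized dictionary entries, the placeholder transliteration, and the order in which the lookups are tried.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class ChunkStringTemplate:
    """Hypothetical CST record: source chunk, target string, word alignment."""
    source_chunk: Tuple[str, ...]            # chunk in the source language
    target_string: str                       # string in the target language
    alignment: Tuple[Tuple[int, int], ...]   # (source index, target index) pairs


# Toy stand-ins (assumptions): a real system would query WordNet and a
# full English-Bengali dictionary instead of these small maps.
HYPERNYMS: Dict[str, str] = {"sparrow": "bird", "bird": "animal"}
DICTIONARY: Dict[str, str] = {"bird": "pakhi", "animal": "prani"}  # romanized


def transliterate(word: str) -> str:
    """Placeholder only: marks the word instead of converting its script."""
    return f"<translit:{word}>"


def translate_unknown(
    word: str,
    hypernyms: Dict[str, str] = HYPERNYMS,
    dictionary: Dict[str, str] = DICTIONARY,
    max_depth: int = 5,
) -> Tuple[str, str]:
    """Return (translation, method) for a word missing from the templates.

    Assumed order: the word itself, then its hypernyms from nearest to
    farthest, then transliteration when no dictionary entry is found.
    """
    current = word
    for _ in range(max_depth + 1):
        if current in dictionary:
            method = "dictionary" if current == word else "hypernym"
            return dictionary[current], method
        if current not in hypernyms:
            break  # reached the top of the toy hypernym chain
        current = hypernyms[current]
    return transliterate(word), "transliteration"


if __name__ == "__main__":
    for w in ("bird", "sparrow", "zyx"):
        print(w, translate_unknown(w))
```

Walking up the hypernym chain trades precision for coverage: "sparrow" falls back to the entry for "bird", which is less specific but still meaningful, while a word with no candidate at all is passed to transliteration.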

Similar resources

Template Extraction for a Bidirectional English-Filipino Machine Translation System

A bidirectional English-Filipino Example-based Machine Translation System that learns and uses templates is presented. The system uses machine learning techniques to initially extract templates from a given bilingual corpus. These templates are subsequently used for translating English input text into Filipino and vice versa. The system implements the similarity template learning algorithm perf...

Phrase-Boundary Translation Model Using Shallow Syntactic Labels

The phrase-boundary model for statistical machine translation labels the rules with classes of boundary words on the target-side phrases of the training corpus. In this paper, we extend the phrase-boundary model using shallow syntactic labels including POS tags and chunk labels. With the priority of chunk labels, the proposed model names non-terminals with shallow syntactic labels on the boundaries of ...

The Best Templates Match Technique For Example Based Machine Translation

It has been shown that large-scale realistic Knowledge-Based Machine Translation (KBMT) applications require the acquisition of a huge amount of knowledge about language and about the world. This knowledge is encoded in computational grammars, lexicons and domain models. Another approach, which avoids the need for collecting and analyzing massive knowledge, is the Example-Based approach, which is the topic of...

A Full-Text Experiment in Example-Based Machine Translation

This paper describes an experiment in example-based machine translation (EBMT) on full text. The unit of translation is a text chunk of arbitrary length, in contrast to sentence-level EBMT experiments. The intra- and inter-language matching techniques and metrics used in the experiment are described.

Template-Based English-Filipino Machine Translation System

This paper presents a template-based machine translation system that extracts templates from a given bilingual corpus, then uses these templates to perform bidirectional English-Filipino translations. The system extends the similarity template learning algorithm of Cicekli and Guvenir [2] by refining existing templates and deriving templates from previously learned chunks. Chunk alignment and ...

Publication date: 2011